Optimal Policies for a Sequential Decision Process
Related resources
Constrained Markov Decision Process and Optimal Policies
In the course lectures, we discussed the unconstrained Markov Decision Process (MDP) at length, including its dynamic programming decomposition and optimal policies. In this report we discuss a different model, the constrained MDP. There are many practical reasons to study constrained MDPs. For instance, in wireless sensor networks, each...
Convergence in a sequential two stages decision making process
We analyze a sequential decision making process, in which at each step the decision is made in two stages. In the first stage a partially optimal action is chosen, which allows the decision maker to learn how to improve it under the new environment. We show how inertia (cost of changing) may lead the process to converge to a routine where no further changes are made. We illustrate our scheme with some...
Optimal sequential inspection policies
We consider the problem of combining a given set of diagnostic tests into an inspection system that can classify items of interest (cases) with maximum accuracy subject to budget constraints. One motivating application is sequencing diagnostic tests for container inspection problems, where the diagnostic tests may correspond to radiation sensors, document checks, or imaging systems. We conside...
Optimal Adaptive Policies for Sequential Allocation Problems
$V_n^{\pi^0}(\theta)$ under any policy $\pi^0$ in $C_R$ is equal to $n\mu^*(\theta) - M(\theta)\log n + o(\log n)$, as $n \to \infty$, where $\mu^*(\theta)$ is the largest population mean and $M(\theta)$ is a constant. (ii) Policies in $C_R$ are asymptotically optimal within a larger class $C_{UF}$ of "uniformly fast convergent" policies in the sense that $\lim_{n \to \infty} \bigl(n\mu^*(\theta) - V_n^{\pi^0}(\theta)\bigr) / \bigl(n\mu^*(\theta) - V_n^{\pi}(\theta)\bigr) \le 1$, for any $\pi \in$...
Journal
Journal title: Journal of the Society for Industrial and Applied Mathematics
Year: 1965
ISSN: 0368-4245,2168-3484
DOI: 10.1137/0113003